In [1]:
In [2]:
Out[2]:
       Unnamed: 0  carat        cut color clarity  depth  table  price     x     y     z
0               1   0.23      Ideal     E     SI2   61.5   55.0    326  3.95  3.98  2.43
1               2   0.21    Premium     E     SI1   59.8   61.0    326  3.89  3.84  2.31
2               3   0.23       Good     E     VS1   56.9   65.0    327  4.05  4.07  2.31
3               4   0.29    Premium     I     VS2   62.4   58.0    334  4.20  4.23  2.63
4               5   0.31       Good     J     SI2   63.3   58.0    335  4.34  4.35  2.75
...           ...    ...        ...   ...     ...    ...    ...    ...   ...   ...   ...
53935       53936   0.72      Ideal     D     SI1   60.8   57.0   2757  5.75  5.76  3.50
53936       53937   0.72       Good     D     SI1   63.1   55.0   2757  5.69  5.75  3.61
53937       53938   0.70  Very Good     D     SI1   62.8   60.0   2757  5.66  5.68  3.56
53938       53939   0.86    Premium     H     SI2   61.0   58.0   2757  6.15  6.12  3.74
53939       53940   0.75      Ideal     D     SI2   62.2   55.0   2757  5.83  5.87  3.64

53940 rows × 11 columns

In [3]:
   Unnamed: 0  carat      cut color clarity  depth  table  price     x     y  \
0           1   0.23    Ideal     E     SI2   61.5   55.0    326  3.95  3.98   
1           2   0.21  Premium     E     SI1   59.8   61.0    326  3.89  3.84   
2           3   0.23     Good     E     VS1   56.9   65.0    327  4.05  4.07   
3           4   0.29  Premium     I     VS2   62.4   58.0    334  4.20  4.23   
4           5   0.31     Good     J     SI2   63.3   58.0    335  4.34  4.35   

      z  
0  2.43  
1  2.31  
2  2.31  
3  2.63  
4  2.75  
In [5]:

Now let’s start analyzing diamond prices. I will first look at the relationship between carat and price to see how a diamond’s carat weight affects its price:

In [6]:
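The code for this cell did not survive in this export. Here is a minimal sketch of how such a plot could be produced, assuming plotly express as the plotting library (an assumption, not confirmed by the notebook) and a hypothetical helper name `carat_price_figure`. The small `sample` frame only repeats the first five rows printed above so that the snippet runs on its own:

```python
import pandas as pd

# First five rows of the dataset as printed above (only the columns used here).
sample = pd.DataFrame({
    "carat": [0.23, 0.21, 0.23, 0.29, 0.31],
    "cut": ["Ideal", "Premium", "Good", "Premium", "Good"],
    "price": [326, 326, 327, 334, 335],
})


def carat_price_figure(df):
    """Hypothetical helper: scatter of price against carat, one colour per cut."""
    # Assumed plotting library; imported here so the data part runs without it.
    import plotly.express as px

    return px.scatter(df, x="carat", y="price", color="cut")
```

In a notebook, `carat_price_figure(sample).show()` would render the figure; passing the full dataset instead of `sample` gives the complete picture.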
[Figure: price plotted against carat; legend: cut (Ideal, Premium, Good, Very Good, Fair)]

We can see a roughly linear, positive relationship between carat and price: diamonds with a higher carat weight tend to have higher prices.
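
One way to put a number on this visual impression is the Pearson correlation coefficient. The sketch below is only an illustration: it uses the first five rows printed earlier, so the value it prints says nothing reliable about the full dataset, where the same two lines would be applied to the whole DataFrame.

```python
import pandas as pd

# First five rows of the dataset as printed earlier (carat and price only).
sample = pd.DataFrame({
    "carat": [0.23, 0.21, 0.23, 0.29, 0.31],
    "price": [326, 326, 327, 334, 335],
})

# Series.corr uses the Pearson coefficient by default.
r = sample["carat"].corr(sample["price"])
print(f"Pearson r between carat and price: {r:.3f}")
```

A value close to 1 is consistent with a strong positive linear relationship, while a value near 0 would suggest no linear relationship.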

Now I will add a new column to this dataset, size, calculated as length × width × depth (the x, y, and z columns) of the diamond. Note that this uses z, not the separate depth column:

In [7]:
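The code for this cell is missing from the export, but the printed values are consistent with multiplying the x, y, and z columns. A minimal sketch of that calculation, using the first five rows shown earlier as a stand-in for the full dataset (the name `sample` is my own):

```python
import pandas as pd

# x, y, z of the first five rows as printed earlier.
sample = pd.DataFrame({
    "x": [3.95, 3.89, 4.05, 4.20, 4.34],
    "y": [3.98, 3.84, 4.07, 4.23, 4.35],
    "z": [2.43, 2.31, 2.31, 2.63, 2.75],
})

# Element-wise product of the three dimension columns.
sample["size"] = sample["x"] * sample["y"] * sample["z"]
print(sample)
```
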
       Unnamed: 0  carat        cut color clarity  depth  table  price     x  \
0               1   0.23      Ideal     E     SI2   61.5   55.0    326  3.95   
1               2   0.21    Premium     E     SI1   59.8   61.0    326  3.89   
2               3   0.23       Good     E     VS1   56.9   65.0    327  4.05   
3               4   0.29    Premium     I     VS2   62.4   58.0    334  4.20   
4               5   0.31       Good     J     SI2   63.3   58.0    335  4.34   
...           ...    ...        ...   ...     ...    ...    ...    ...   ...   
53935       53936   0.72      Ideal     D     SI1   60.8   57.0   2757  5.75   
53936       53937   0.72       Good     D     SI1   63.1   55.0   2757  5.69   
53937       53938   0.70  Very Good     D     SI1   62.8   60.0   2757  5.66   
53938       53939   0.86    Premium     H     SI2   61.0   58.0   2757  6.15   
53939       53940   0.75      Ideal     D     SI2   62.2   55.0   2757  5.83   

          y     z        size  
0      3.98  2.43   38.202030  
1      3.84  2.31   34.505856  
2      4.07  2.31   38.076885  
3      4.23  2.63   46.724580  
4      4.35  2.75   51.917250  
...     ...   ...         ...  
53935  5.76  3.50  115.920000  
53936  5.75  3.61  118.110175  
53937  5.68  3.56  114.449728  
53938  6.12  3.74  140.766120  
53939  5.87  3.64  124.568444  

[53940 rows x 12 columns]

Now let’s have a look at the relationship between the size of a diamond and its price:

In [8]:
[Figure: price plotted against size; legend: cut (Ideal, Premium, Good, Very Good, Fair)]

Now let’s have a look at the prices of each cut of diamond, broken down by colour:

In [9]:
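The code for this cell is not preserved. The surviving labels suggest cut on the x-axis, price on the y-axis, and one legend entry per colour; a box plot is one plausible chart type for that, and it is what I assume in this sketch. The plotting library (plotly express) and the helper name `price_by_cut_and_color` are assumptions too, and `sample` only repeats the ten rows shown at the top of the notebook:

```python
import pandas as pd

# The ten rows (head and tail) printed at the top of the notebook.
sample = pd.DataFrame({
    "cut": ["Ideal", "Premium", "Good", "Premium", "Good",
            "Ideal", "Good", "Very Good", "Premium", "Ideal"],
    "color": ["E", "E", "E", "I", "J", "D", "D", "D", "H", "D"],
    "price": [326, 326, 327, 334, 335, 2757, 2757, 2757, 2757, 2757],
})


def price_by_cut_and_color(df):
    """Hypothetical helper: box plot of price per cut, one box colour per grade."""
    # Assumed plotting library; imported here so the data part runs without it.
    import plotly.express as px

    return px.box(df, x="cut", y="price", color="color")
```

In a notebook, `price_by_cut_and_color(sample).show()` would render the figure.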
[Figure: price by cut; legend: color (E, I, J, H, F, G, D)]

Now let’s have a look at the prices of each cut of diamond, broken down by clarity:

In [15]:
[Figure: price by cut; legend: clarity (SI2, SI1, VS1, VS2, VVS2, VVS1, I1, IF)]
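
As a numeric companion to the figure, the same comparison can be read off a grouped summary. This is a sketch of my own rather than the notebook's code: it takes the median price for each cut and clarity pair, using only the ten rows shown at the top of the notebook, so the numbers are illustrative and not representative of the full dataset.

```python
import pandas as pd

# The ten rows (head and tail) printed at the top of the notebook.
sample = pd.DataFrame({
    "cut": ["Ideal", "Premium", "Good", "Premium", "Good",
            "Ideal", "Good", "Very Good", "Premium", "Ideal"],
    "clarity": ["SI2", "SI1", "VS1", "VS2", "SI2",
                "SI1", "SI1", "SI1", "SI2", "SI2"],
    "price": [326, 326, 327, 334, 335, 2757, 2757, 2757, 2757, 2757],
})

# Median price for every (cut, clarity) combination present in the data.
summary = sample.groupby(["cut", "clarity"])["price"].median()
print(summary)
```

The median is used here because it is less sensitive to a few very expensive stones than the mean.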
